Paginate by creation time instead of key order #96
benthecarman wants to merge 1 commit into lightningdevkit:main
Conversation
Force-pushed defad39 to cb70f98
tankyleo left a comment:
We'll want to update types.rs to require this ordering too
Force-pushed dc205c9 to cd32024
Responded to review.
```diff
 message ListKeyVersionsResponse {
-  // Fetched keys and versions.
+  // Fetched keys and versions, ordered by creation time (oldest first).
```
To confirm here, the server is free to choose any ordering they would like in case two keys have the same timestamp.

We have the server sort by key after timestamp. I'll add that here.

Yes, but I don't think this is a requirement on the VSS API, right? I.e., clients don't care what ordering is picked beyond newest first.

I think it's better to be more explicit than less. We have to define a tiebreaker anyway, so we may as well put it in the docs.

Let me see, this defines an API constraint, right, in addition to just documentation? For sure I see the use for documentation here, but I don't want clients to start relying on this behavior.
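For reference, the ordering contract being debated boils down to an ORDER BY clause like the following sketch (illustrative SQL in a Rust constant; the table and column names are assumed from the rest of this thread, and this is not the actual vss-server query):

```rust
// Sketch of the ordering contract under discussion: creation time first,
// key as the deterministic tiebreaker. Table/column names are assumed from
// this thread; this is not the actual server query.
const ORDERED_LIST_SKETCH: &str =
    "SELECT key, version FROM vss_db \
     WHERE user_token = $1 AND store_id = $2 \
     ORDER BY created_at ASC, key ASC \
     LIMIT $3";
```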
Please request review when you are ready so I don't miss this PR.
Force-pushed 6622747 to c02dcda
```proto
// Use this value to query for next-page of paginated `ListKeyVersions` operation, by specifying
// this value as the `page_token` in the next request.
//
// If `next_page_token` is empty (""), then the "last page" of results has been processed and
```
No, I think we were intentionally following protobuf's/Google's best practices here. https://google.aip.dev/158 states:
- Response messages for collections should define a string next_page_token field, providing the user with a page token that may be used to retrieve the next page.
- The field containing pagination results should be the first field in the message and have a field number of 1. It should be a repeated field containing a list of resources constituting a single page of results.
- If the end of the collection has been reached, the next_page_token field must be empty. This is the only way to communicate "end-of-collection" to users.
- If the end of the collection has not been reached (or if the API can not determine in time), the API must provide a next_page_token.
IMO it would be good to revert and include this context in the docs.
Force-pushed c02dcda to 34d4241
tankyleo left a comment:
Rushed some comments out, will be back after lunch.
```rust
}

let first_page = ctx.list(None, Some(page_size), None).await?;
assert_eq!(first_page.key_versions.len(), page_size as usize);
```
Given the VSS API, the page could have length 0 here. As long as the page token is not empty, the client would be expected to make another request with the new page token.

I'm confused what you're saying here. This is the first page we are requesting; the only way we'd get a length of 0 would be if we had no items.

Just that I think a VSS server is within bounds if it returns a response with an empty list, but a non-empty token. The client would be expected to continue asking for pages.

I'm mostly going from this line in the docs of ListKeyVersionsRequest:

```rust
/// `page_size` is used by clients to specify the maximum number of results that can be returned by
/// the server.
/// The server may further constrain the maximum number of results returned in a single page.
/// If the `page_size` is 0 or not set, the server will decide the number of results to be returned.
#[prost(int32, optional, tag = "3")]
pub page_size: ::core::option::Option<i32>,
```

TL;DR: you can't assume the page you get back is the same length as the `page_size` in your ListKeyVersionsRequest.

> TL;DR: you can't assume the page you get back is the same length as the `page_size` in your ListKeyVersionsRequest.

This is true.

> if it returns a response with an empty list, but a non-empty token.

But there is no reason to respond with an empty list and a non-empty pagination token. (I don't think this should happen; I can add a kvstore test/assert for it if one doesn't exist already.)
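To make the client-side contract concrete, here is a minimal pagination-loop sketch. It assumes the request/response shapes quoted in this thread and a vss-client-style async `list_key_versions` method; the module paths and exact field set are assumptions, not the crate's confirmed API:

```rust
use vss_client::client::VssClient;
use vss_client::error::VssError;
use vss_client::types::{KeyValue, ListKeyVersionsRequest};

// Collects every key version for a store. A single page may be shorter than
// the requested `page_size`, or even empty; only an empty (or absent)
// `next_page_token` signals the end of the collection.
async fn list_all(client: &VssClient, store_id: &str) -> Result<Vec<KeyValue>, VssError> {
	let mut all = Vec::new();
	let mut page_token: Option<String> = None;
	loop {
		let request = ListKeyVersionsRequest {
			store_id: store_id.to_string(),
			key_prefix: None,
			page_size: Some(100),
			page_token: page_token.take(),
		};
		let response = client.list_key_versions(&request).await?;
		all.extend(response.key_versions);
		match response.next_page_token {
			Some(token) if !token.is_empty() => page_token = Some(token),
			_ => break, // empty token: last page processed
		}
	}
	Ok(all)
}
```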
Done with this pass here, just needed to add the comment about the VSS server API constraint.
Adding some historical context: VSS was purposefully built to be storage engine agnostic. The VSS protocol/API/data model doesn't enforce a specific underlying storage database. We can totally revisit this requirement and say VSS only works with Postgres-like relational databases. But extending pagination to order by creation_time limits the capability to use any modern pure KV store as an underlying database engine, like DynamoDB or CosmosDB. I understand the need to efficiently paginate through large result sets. A couple of alternatives that preserve storage engine agnosticism:
Force-pushed 34d4241 to e17c676
Fixed @tankyleo's comments about tests.
@G8XSU Thank you for the feedback, I'll be considering the tradeoffs over the next few days. I was wondering, do you have an email where I can reach you? Feel free to ping me at "hello at leonash dot net". I was wondering if we could get the vss-client crate at https://crates.io/crates/vss-client from you.
@G8XSU my perspective:

In theory, yes, but ldk-node isn't using this and migrating would be a huge lift.

ldk-node uses txid/payment_hash as keys, so we can't really use prefixes.

Yes, but then the pagination isn't really useful anymore. You just end up getting a random set of payments rather than an actual ordered list.

IMO Postgres scales really well, and if you're scaling past something like Postgres then it's likely you aren't using the off-the-shelf solution anymore and will have a whole team to manage things like this. And being able to sort by something useful rather than just lexicographic key order is worth it.
Force-pushed 0bddc2a to c70f65e
Force-pushed 0e4cd21 to b67174e
LGTM

> Right, but if it's not part of the API guarantees, isn't it odd to lean on/assume that the client only wants the global version? At the very least the comment is misleading, IMO.

Let's just address this comment from tnull, thank you. Perhaps something like "page 0 means we get to decide".
Force-pushed b67174e to e618d5f
Updated the comment about 0 page size.
tnull left a comment:
Generally fine by me I think, one comment.
Aside from that, I do however wonder if we should finally add a protocol version to VSS? While this is not an API-breaking change, clients will have no good idea whether they can expect pagination to work when they connect to a particular backend.
@tankyleo Any thoughts?
```diff
 message ListKeyVersionsResponse {
-  // Fetched keys and versions.
+  // Fetched keys and versions, ordered by creation time (newest first).
```
Remind me, is there any particular reason we need to use time for this? Wouldn't we get around the need for a tiebreaker if we'd simply use a monotonically increasing atomic counter instead (e.g., a Postgres BIGSERIAL column)? Then we'd be certain that each entry has a unique value?
Claude:

Monotonic counter vs. creation_time

Your intuition is right: a monotonically increasing counter (e.g., a Postgres BIGSERIAL column) would be strictly simpler here. The current approach has to deal with:

- Tie-breaking: the compound condition `(created_at < $3 OR (created_at = $3 AND key > $4))` in the SQL query
- A composite page token: encoding both the timestamp and the key (`0:<micros>:<key>`)
- A composite index: `(user_token, store_id, created_at DESC, key ASC) INCLUDE (version)`
- A dedicated test just for the tie-breaking behavior

With an auto-incrementing counter (say `row_id BIGSERIAL`), all of that collapses to:

- `WHERE row_id < $cursor ORDER BY row_id DESC LIMIT $N`
- Page token = just the counter value
- A simple single-column addition to the index
- No tie-breaking needed, since values are unique by definition

The only argument for creation_time would be if it carried semantic meaning the client cares about (e.g., "show me keys created after X"). But looking at the ListKeyVersionsResponse, created_at isn't exposed to the client; it's purely an internal ordering mechanism. So you're paying the complexity tax of timestamps without getting the semantic benefit.
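A sketch of what the counter-based cursor amounts to, under the assumptions in the analysis above (the `vss_db` table and `sort_order` column come up later in this thread; this is illustrative, not the postgres_store.rs code):

```rust
// Keyset pagination over a unique, monotonically increasing counter.
// Illustrative only; table/column names are assumed from this thread.
const LIST_PAGE_SKETCH: &str =
	"SELECT key, version, sort_order FROM vss_db \
	 WHERE user_token = $1 AND store_id = $2 AND sort_order < $3 \
	 ORDER BY sort_order DESC LIMIT $4";

// First page: start the cursor above any possible counter value.
fn parse_page_token(token: Option<&str>) -> Result<i64, String> {
	match token {
		None | Some("") => Ok(i64::MAX),
		Some(t) => t.parse::<i64>().map_err(|e| format!("invalid page token: {e}")),
	}
}

// The next token is just the counter of the last row on this page; an empty
// token signals end-of-collection, per AIP-158.
fn next_page_token(last_row_counter: Option<i64>) -> String {
	last_row_counter.map(|c| c.to_string()).unwrap_or_default()
}
```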
Thank you tnull for raising this point. I did some back-and-forth with Claude, and yeah, this is interesting!

My one question at this point is backwards compat. Do you have thoughts on this point? I'm thinking if keys created before this commit don't have strict creation order, this is OK. Perhaps we can use the VSS version you described above to encourage people to upgrade once we start relying on PaginatedKVStore in LDK Node.

@benthecarman let me know what you think.
@tnull I pinged you on the response above in case it helps bubble this up in your inbox :)
Ah, I didn't know we could use BIGSERIAL for non-primary keys (in SQLite you can't).

Yeah, this might be better. On backwards compat, we should still be able to backfill the column by sorting by creation time; however, the migration that Claude generated for this is pretty big/ugly.
> Thank you tnull for raising this point. I did some back-and-forth with Claude, and yeah, this is interesting! My one question at this point is backwards compat. Do you have thoughts on this point? I'm thinking if keys created before this commit don't have strict creation order, this is OK. Perhaps we can use the VSS version you described above to encourage people to upgrade once we start relying on PaginatedKVStore in LDK Node. @benthecarman let me know what you think.

I think the backfill should still handle it for the most part? But yeah, apart from that I think it might be good to lean on the versioning byte for this going forward, and make sure we publish vss-server v0.1 (with protocol versioning support) prior to LDK Node v0.8?

Yeah, I think this makes sense, probably for a follow-up though.
Force-pushed e618d5f to 9d87571
Pushed to use the monotonic counter. Just did the migration by adding the column and letting Postgres backfill it in scan order. Doing it ourselves by sorting by creation time got really ugly and, after talking with @tankyleo, didn't seem worth it.
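A minimal sketch of what that migration amounts to, assuming the column and index described later in the commit message (the real DDL lives in rust/impls/src/migrations.rs and may differ):

```rust
// Assumed migration DDL; the actual statements in migrations.rs may differ.
// Adding a BIGSERIAL column makes Postgres create a backing sequence and
// backfill existing rows in whatever order the table rewrite scans them,
// which is why pre-existing rows don't get strict creation order.
const ADD_SORT_ORDER_COLUMN: &str =
	"ALTER TABLE vss_db ADD COLUMN sort_order BIGSERIAL UNIQUE NOT NULL";

// Covering index so list queries can be answered with index-only scans.
// The index name is hypothetical.
const ADD_LIST_INDEX: &str =
	"CREATE INDEX vss_db_list_idx ON vss_db \
	 (user_token, store_id, sort_order DESC) INCLUDE (key, version)";
```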
Force-pushed 9d87571 to 4c0b480
tankyleo left a comment:

LGTM, will take another pass tomorrow.
tnull left a comment:
Generally looks good I think, but there are some follow-up changes we should do, now that we don't use creation time anymore.
```diff
 message ListKeyVersionsResponse {
-  // Fetched keys and versions.
+  // Fetched keys and versions, ordered by creation time (newest first).
```
These docs are now inaccurate, no?

Our implementation of this contract changed, but the contract remains the same; I think this is still accurate. Same for the other comments below.
```rust
	Ok(())
}

async fn list_should_return_results_ordered_by_creation_time() -> Result<(), VssError> {
```
Same here and below, the test names seem inaccurate now that we don't sort by time?
```diff
 #[derive(Clone, PartialEq, ::prost::Message)]
 pub struct ListKeyVersionsResponse {
-    /// Fetched keys and versions.
+    /// Fetched keys and versions, ordered by creation time (newest first).
```
```rust
const VERSION_COLUMN: &str = "version";
const SORT_ORDER_COLUMN: &str = "sort_order";

const CURRENT_PAGE_TOKEN_VERSION: char = '0';
```
Hmm, I think now that we don't have extra semantics, we should be good to drop the page token versioning byte and all associated logic again, and simply use the sort_order as the page token? (Especially given that we're discussing adding a protocol-level version, which we could also use if we'd ever find that we need to switch page token semantics again?)

Makes sense to me, yes, given the upcoming protocol-level version.
```diff
 VSS ships with a PostgreSQL implementation by default and can be hosted in your favorite infrastructure/cloud provider
-(AWS/GCP) and its backend storage can be switched with some other implementation for KeyValueStore if needed.
+(AWS/GCP). The backend storage can be switched with another implementation, but it must support ordering by creation
+time, a simple key-value store is not sufficient.
```
No, we don't sort by creation time anymore. Might be good to be more accurate here, too.

@tnull all your comments here are wrong it seems; we still do sort by creation time, we just use a counter in the db.

I guess it depends on how literally you want to take the term 'creation time'? Fine to leave the docs if you think the slight semantic difference doesn't matter, but IMO we should still drop the now-unnecessary semantics in the page token itself (#96 (comment))?
Clients often need to fetch recent entries without iterating over the entire keyspace. Ordering by a monotonic insertion counter lets callers retrieve the newest records first and stop early, which is not possible with lexicographic key ordering.

A BIGSERIAL sort_order column is added to vss_db. New rows get a monotonically increasing value from its sequence, and list queries order by sort_order DESC. Because sort_order is UNIQUE, the page token collapses to a single integer with no tiebreaker needed. A composite index on (user_token, store_id, sort_order DESC) INCLUDE (key, version) keeps list queries as index-only scans.

Pre-existing rows receive sequence values in heap-scan order during the column rewrite, so their list ordering will not reflect creation time; new rows onward do.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Force-pushed 4c0b480 to e7566b6
Okay, removed the version number from the page token.
tankyleo left a comment:
I ran this through gpt-5.5 xhigh, and it found these two things; sorry I didn't do this earlier.

- Medium: next_page_token now exposes the raw global sort_order. Since sort_order is a global BIGSERIAL UNIQUE (rust/impls/src/migrations.rs:38) and the token is just sort_order.to_string() (rust/impls/src/postgres_store.rs:40), clients can infer service-wide write volume and gaps caused by other tenants. For a multi-user storage service, make the token opaque or use a per-user/store ordering value.
- Low: negative page_size is still unvalidated, and this commit makes page_size = -1 silently return an empty successful page because fetch_limit becomes 0 (rust/impls/src/postgres_store.rs:688). Other negative values still become PostgreSQL limit errors. Reject page_size < 0 with InvalidRequestError before computing the limit.

Curious what you think of the first one here; it's a good point, I find.
I had a similar thought, but I don't really see the harm. Maybe you can infer something about user activity, but it's not really the end of the world. Doing a per-user token doesn't totally work because the BIGSERIAL is across the whole table; we could encrypt it or something, but it is nice that the user can easily compare between two tokens.
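For the second (Low) finding, the guard could be as small as the following sketch. `InvalidRequestError` is the variant named in the review, but the exact error type and the surrounding code are assumptions:

```rust
// Sketch: reject negative page sizes before computing the SQL LIMIT.
// The error enum here is a stand-in; vss-server's actual type may differ.
enum VssError {
	InvalidRequestError(String),
}

fn fetch_limit(page_size: Option<i32>, server_max: i64) -> Result<i64, VssError> {
	match page_size {
		Some(n) if n < 0 => Err(VssError::InvalidRequestError(
			"page_size must be non-negative".to_string(),
		)),
		// 0 or unset: the server decides how many results to return.
		Some(0) | None => Ok(server_max),
		Some(n) => Ok(i64::from(n).min(server_max)),
	}
}
```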
tankyleo left a comment:

Sounds good, thanks again for the PR.
Clients often need to fetch recent payments or entries without iterating over the entire keyspace. Ordering by created_at lets callers retrieve the newest records first and stop early, which is not possible with lexicographic key ordering.
The page token now encodes (created_at, key) so the cursor remains unique even when multiple rows share the same timestamp. A composite index on (user_token, store_id, created_at, key) keeps the query efficient, and a migration back-fills any NULL created_at values and adds the NOT NULL constraint.